Abstract: We show that future Ultra-High Energy Cosmic Ray (UHECR) samples should be able to distinguish 
whether the sources of UHECRs are hosted by galaxy clusters or ordinary galaxies, or whether the sources 
are uncorrelated with the large-scale structure of the universe. Moreover, this is true independently of 
arrival direction uncertainty due to magnetic deflection or measurement error. The reason for this is the 
simple property that the strength of large-scale clustering for extragalactic sources depends on their mass, 
with more massive objects, such as galaxy clusters, clustering more strongly than lower mass objects, 
such as ordinary galaxies. 



Introduction 

Identifying the sources of ultrahigh energy cosmic 
rays (UHECRs, here $E > 10^{19}$ eV = 10 EeV) 
is complicated by the deflection they presumably 
experience in Galactic and extragalactic magnetic 
fields, as well as their relatively poor arrival 
direction determinations, typically ~1°. Arrival 
directions of most UHECRs are thus not known 
well enough to match their positions with spe- 
cific astrophysical objects. However, there is also 
useful information in the clustering of UHECRs 
on large scales, where ~ few degree uncertain- 
ties in position become unimportant. The cluster- 
ing of galaxies in the universe is typically quan- 
tified by the two-point correlation function or its 
analog in Fourier space, the power spectrum. The 
two-point correlation function $\xi(r)$ of any class of 
objects (e.g., galaxies of a certain luminosity or 
color) is defined as the excess number of pairs of 
such objects at physical separation r over that ex- 
pected for a random (Poisson) distribution. In Cold 
Dark Matter models, the large-scale amplitude of 
$\xi(r)$ (usually referred to as the bias) of a population 
of objects depends only on their mass, with 
more massive objects, such as clusters of galax- 
ies, clustering more strongly than less massive ob- 
jects, such as ordinary galaxies [5, 6, 1]. The large- 
scale bias of a UHECR sample is therefore a robust 
and informative measure of the clustering properties 
of the source. We cannot measure physical 
separations for pairs involving UHECRs because 
they do not have measured redshifts. However, 
we can measure the angular correlation function 
$\omega(\theta)$. As is the case for $\xi(r)$, the large-scale amplitude 
of $\omega(\theta)$ for a UHECR sample depends on 
the nature of the astrophysical source. However, it 
also depends on the depth of the sample because 
deeper samples mix more physically uncorrelated 
pairs and thus show weaker angular clustering. In 
order to access the information in the large-scale 
angular clustering of UHECRs, we must therefore 
know the depth of our UHECR sample. In this pa- 
per, we demonstrate what can be learned from the 
large-scale angular clustering of UHECRs, we esti- 
mate what kind of sample is needed to do this anal- 
ysis, and we show how to deal with the unknown 
depth of a UHECR sample, using the GZK effect. 
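
For reference, the standard definitions behind the quantities above can be written compactly. This is a textbook summary, not a formula taken from this paper; the symbols $\bar{n}$ (mean number density), $\bar{\mathcal{N}}$ (mean surface density), $b$ (bias), and $\xi_{\rm m}$ (matter correlation function) are introduced here only for illustration.

```latex
% Joint probability of finding objects in volume elements dV_1 and dV_2 separated by r
dP = \bar{n}^2 \left[ 1 + \xi(r) \right] dV_1\, dV_2

% Angular analog for solid-angle elements separated by \theta
dP = \bar{\mathcal{N}}^2 \left[ 1 + \omega(\theta) \right] d\Omega_1\, d\Omega_2

% Large-scale linear bias of one population, and of a cross-correlation
% between two populations A and B
\xi_{\rm obj}(r) \simeq b^2\, \xi_{\rm m}(r), \qquad
\xi_{AB}(r) \simeq b_A\, b_B\, \xi_{\rm m}(r)
```

In this language, sources hosted by clusters carry a larger $b$ than sources hosted by ordinary galaxies, so their cross-correlation with galaxies should have a higher large-scale amplitude.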

Large-Angle Clustering of UHECRs 

We demonstrate what can be learned from the 
large-angle clustering of UHECRs by creating 
mock samples of UHECRs assuming different as- 
trophysical sources and examining their resulting 
clustering. We use the Sloan Digital Sky Survey 
(SDSS) [7] to create a volume-limited sample 
of galaxies that is complete out to a distance 
of 286 Mpc. We select a sample of massive galaxy 
clusters in the same volume taken from an SDSS 
group and cluster catalog [2]. Based on their luminosities, 
we estimate these clusters to have masses 
greater than $10^{14}\,h^{-1}M_\odot$. We then measure angular 
cross-correlation functions of each of these 
samples with the galaxy sample (so, for the galaxy 
case, we are measuring the autocorrelation) using 
the Landy-Szalay [4] estimator: 

$$\omega(\theta) = \frac{N_{D_1D_2} - N_{D_1R} - N_{D_2R} + N_{RR}}{N_{RR}},$$

where $N_{D_1D_2}$ is the number of pairs as a function 
of $\theta$ between the two data samples (in this case, 
galaxies and a second sample), $N_{D_1R}$ and $N_{D_2R}$ 
are the numbers of pairs as a function of $\theta$ between 
each data sample and a random sample, 
and $N_{RR}$ is the number of random-random pairs. 
Figure 1 shows the resulting angular correlation 
functions: cluster-galaxy, galaxy-galaxy, as well as 
the random-galaxy case. As expected, the cluster- 
galaxy correlation function has a higher amplitude 
than the galaxy-galaxy correlation function on all 
angular scales, and the random-galaxy correlation 
function is equal to zero by construction. 
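
The estimator described above can be sketched in a few lines. This is a minimal brute-force illustration, not the code used for Figure 1: the function names, the full-sky uniform test catalogs, and the coarse bin edges are all assumptions made here, and pair counts are normalized by the total number of pairs so that catalogs of different sizes can be combined.

```python
import math
import random


def unit_vector(ra_deg, dec_deg):
    """Convert (RA, Dec) in degrees to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def separation_deg(u, v):
    """Great-circle separation in degrees between two unit vectors."""
    dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def bin_index(theta, edges):
    """Index of the bin [edges[k], edges[k+1]) containing theta, or None."""
    for k in range(len(edges) - 1):
        if edges[k] <= theta < edges[k + 1]:
            return k
    return None


def cross_pair_fractions(cat_a, cat_b, edges):
    """Fraction of all (a, b) pairs that fall in each angular bin."""
    counts = [0] * (len(edges) - 1)
    for u in cat_a:
        for v in cat_b:
            k = bin_index(separation_deg(u, v), edges)
            if k is not None:
                counts[k] += 1
    total = len(cat_a) * len(cat_b)
    return [c / total for c in counts]


def auto_pair_fractions(cat, edges):
    """Fraction of all unordered pairs within one catalog in each bin."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(cat)):
        for j in range(i + 1, len(cat)):
            k = bin_index(separation_deg(cat[i], cat[j]), edges)
            if k is not None:
                counts[k] += 1
    total = len(cat) * (len(cat) - 1) / 2
    return [c / total for c in counts]


def landy_szalay_cross(d1, d2, rand, edges):
    """(D1D2 - D1R - D2R + RR) / RR per bin, using normalized pair counts."""
    d1d2 = cross_pair_fractions(d1, d2, edges)
    d1r = cross_pair_fractions(d1, rand, edges)
    d2r = cross_pair_fractions(d2, rand, edges)
    rr = auto_pair_fractions(rand, edges)
    return [(a - b - c + d) / d if d > 0 else float("nan")
            for a, b, c, d in zip(d1d2, d1r, d2r, rr)]


def random_sky(n, rng):
    """n points drawn uniformly over the full sphere, as unit vectors."""
    return [unit_vector(rng.uniform(0.0, 360.0),
                        math.degrees(math.asin(rng.uniform(-1.0, 1.0))))
            for _ in range(n)]


rng = random.Random(1)
edges = [0.0, 30.0, 60.0, 90.0]  # bin edges in degrees (illustrative)
d1, d2, rand = random_sky(150, rng), random_sky(150, rng), random_sky(150, rng)
omega = landy_szalay_cross(d1, d2, rand, edges)  # should scatter around zero
```

Because all three test catalogs are unclustered, the result plays the role of the "random" case. For an autocorrelation (the same catalog passed as both data samples), the cross counts would include each object paired with itself at zero separation, so the first bin edge should then be placed above zero.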

These three curves represent predictions for the 
UHECR-galaxy cross-correlation function in the 
three distinct cases that UHECRs originate from 
astrophysical sources that: (1) live in massive clus- 
ters, (2) live in ordinary galaxies, and (3) are 
uncorrelated with the large-scale structure of the 
universe, such as sources within the Milky Way 
galaxy. The three cases predict different measured 
UHECR-galaxy correlation functions even at large 
angles, where UHECR direction uncertainties due 
to measurement error and magnetic deflections are 
unimportant. 

We next examine how well we can distinguish be- 
tween these different predictions assuming a sam- 
ple of 1000 UHECRs. For the purpose of this test, 
we assume that the sources of UHECRs are, in fact, 
ordinary galaxies. We create a mock UHECR sam- 
ple by randomly selecting 1000 galaxies from our 
SDSS galaxy sample. We create 200 independent 
mock samples in this way and measure their cross-correlation 
with all galaxies. The shaded blue region 
in Figure 1 contains 95% ($2\sigma$) of the mock 
realizations. We then simulate arrival direction 
uncertainties by applying a random 3° Gaussian 
smearing to all our mock UHECRs and repeating 
the correlation function measurements. The red 
shaded region in Figure 1 shows the 95% dispersion 
for these new measurements. As expected, 
the 3° smearing drastically reduces the correlation 
function at small angular scales, but has a negligible 
effect on scales larger than ~2°. Figure 1 
shows that with a sample of 1000 UHECRs, 
the measured clustering at large angles (> 4°) 
alone can easily distinguish between the "cluster", 
"galaxy", and "random" hypotheses. 

Figure 1: Predicted UHECR-galaxy angular cross-correlation 
functions for the cases that the astrophysical 
sources of UHECRs are (1) uncorrelated 
with the large-scale structure in the universe (black 
line), (2) ordinary galaxies (green curve), and (3) 
clusters of galaxies (magenta curve). The blue 
shaded region shows the 95% ($2\sigma$) measurements 
using 200 mock samples of 1000 UHECRs each, 
where the UHECRs are assumed to originate in 
galaxies. The red shaded region shows the same, 
but for mock UHECR arrival directions containing 
3° random Gaussian errors. For this calculation, an 
SDSS galaxy sample of median depth 230 Mpc was 
used. 

What Kind of UHECR Sample Do We Need? 

Although the sample of 1000 UHECRs used in 
Figure 1 is large compared to currently available 
samples, the sample depth in the above illustration 
is also large (median depth = 230 Mpc). The angular 
clustering signal is stronger in shallower 
samples because each angular bin mixes in fewer 
uncorrelated pairs, so smaller UHECR samples 
suffice in shallower volumes. We explore 
this in Figure 2, where we show the signal-to-noise 
(S/N) of a measured UHECR-galaxy cross-correlation 
on large angular scales (6-8°) as a 
function of sample size $N_{\rm CR}$ and depth. To 
calculate this, we perform the same kind of mock 
UHECR analysis as in Figure 1, but using galaxies 
from the 2MASS survey [3]. 
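
The mock UHECR analysis referred to here (draw UHECR directions from a galaxy catalog, optionally smear them, and take the central 95% of the results) might look like the sketch below. The helper names and the placeholder catalog are hypothetical, and the smearing model is an assumption: the 3° is treated as the Gaussian width along each tangent-plane axis, which the text does not specify.

```python
import math
import random


def smear_direction(ra_deg, dec_deg, sigma_deg, rng):
    """Displace a sky position by a 2D Gaussian error.

    ASSUMPTION: sigma_deg is the width along each tangent-plane axis.
    """
    dx, dy = rng.gauss(0.0, sigma_deg), rng.gauss(0.0, sigma_deg)
    rho = math.radians(math.hypot(dx, dy))  # size of the angular offset
    phi = math.atan2(dx, dy)                # position angle, north through east
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    sin_dec2 = (math.sin(dec) * math.cos(rho)
                + math.cos(dec) * math.sin(rho) * math.cos(phi))
    sin_dec2 = max(-1.0, min(1.0, sin_dec2))
    ra2 = ra + math.atan2(math.sin(phi) * math.sin(rho) * math.cos(dec),
                          math.cos(rho) - math.sin(dec) * sin_dec2)
    return math.degrees(ra2) % 360.0, math.degrees(math.asin(sin_dec2))


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions."""
    a1, d1, a2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (math.sin(d1) * math.sin(d2)
               + math.cos(d1) * math.cos(d2) * math.cos(a1 - a2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def make_mock(galaxies, n, rng):
    """Draw n distinct galaxies to stand in for UHECR arrival directions."""
    return rng.sample(galaxies, n)


def percentile(sorted_vals, q):
    """Linearly interpolated quantile q (0..1) of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def central_interval(values, coverage=0.95):
    """Bounds enclosing the central `coverage` fraction of the values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    return percentile(ordered, tail), percentile(ordered, 1.0 - tail)


rng = random.Random(7)
# Placeholder catalog of (RA, Dec) in degrees; a real analysis would load a survey.
galaxies = [(rng.uniform(0.0, 360.0), rng.uniform(-30.0, 30.0)) for _ in range(5000)]
mock = make_mock(galaxies, 1000, rng)
smeared = [smear_direction(ra, dec, 3.0, rng) for ra, dec in mock]
offsets = [separation_deg(a[0], a[1], b[0], b[1]) for a, b in zip(mock, smeared)]
low, high = central_interval(offsets, 0.95)
```

In a full analysis, `central_interval` would be applied bin by bin to the correlation functions measured from many such mocks, which is one way to build shaded regions like those in Figure 1.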

Figure 2 shows that if we want an S/N = 3 (99.7% 
significance) detection of UHECRs clustering like 
ordinary 2MASS galaxies, we need 40 UHECRs 
of median source distance $d_{\rm med} = 50$ Mpc, or 
$N_{\rm CR} = 80$ with $d_{\rm med} = 80$ Mpc, or $N_{\rm CR} = 160$ 
with $d_{\rm med} = 110$ Mpc, or $N_{\rm CR} = 320$ with 
$d_{\rm med} = 150$ Mpc. 
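
As a rough illustration of how such a significance could be estimated from mock samples, the sketch below takes S/N to be the mean of the measured correlation in one angular bin divided by its scatter across mocks. That definition is an assumption (the text does not spell out how S/N is computed), and the example measurements are placeholders, not results.

```python
import statistics

# S/N = 3 requirements quoted in the text: median depth in Mpc -> number of UHECRs.
QUOTED_SN3_REQUIREMENTS = {50: 40, 80: 80, 110: 160, 150: 320}


def signal_to_noise(mock_measurements):
    """Mean over mocks divided by the sample standard deviation across mocks.

    ASSUMPTION: one plausible S/N definition, not confirmed by the source.
    """
    return statistics.mean(mock_measurements) / statistics.stdev(mock_measurements)


# Placeholder values standing in for the 6-8 degree bin measured in five mocks.
example = [0.010, 0.014, 0.012, 0.008, 0.016]
sn = signal_to_noise(example)
```

The quoted requirements show the required sample size doubling at each successive depth, which reflects how quickly the angular signal is diluted as the sample gets deeper.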

We now return to the issue of the unknown depth 
of a given UHECR sample. Fortunately, the GZK 
energy loss phenomenon provides a way to put a 
limit on the depth of a UHECR sample. The rapid 
variation with energy of the energy loss means that 
an ensemble of UHECRs of a given energy has a 
rather well-defined horizon within which they are 
produced. If we assume that the energies of UHECRs 
are well determined, we can use the GZK 
effect to solve for the distance distribution of a 
UHECR sample, given an initial energy spectrum 
of cosmic rays. Assuming a power-law energy spectrum, 
we compute the median depth of a UHECR 
sample as a function of its lower energy cutoff, 
and show the result in Figure 3. We can now 
use Figure 3 to connect the sample depths shown 
in Figure 2 with energy cutoffs for UHECR samples. 
In our S/N = 3 example, the required samples 
would have 40, 80, 160, and 320 UHECRs with 
energies above 90 EeV, 56 EeV, 45 EeV, and 37 EeV, 
respectively. These samples are larger than those 
currently available from AGASA+HiRes, but 
should become available in the near future from the 
Pierre Auger experiment. 
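
The weighting step behind this energy-to-depth mapping can be sketched as follows. This is a toy under loudly stated assumptions, not the GZK calculation used for Figure 3: each energy is given an exponential distance distribution with a user-supplied attenuation length `horizon(E)`, and both `toy_horizon` and the spectral index of 2.5 in the example are arbitrary placeholders.

```python
import math


def median_depth(e_cut, e_max, spectral_index, horizon, n_energy=200):
    """Median source distance for a sample with energies in [e_cut, e_max].

    horizon(E) is a user-supplied attenuation length. Each energy gets a TOY
    distance distribution p(d|E) proportional to exp(-d / horizon(E)).
    """
    log_lo, log_hi = math.log(e_cut), math.log(e_max)
    step = (log_hi - log_lo) / n_energy
    energies = [math.exp(log_lo + (i + 0.5) * step) for i in range(n_energy)]
    # Spectrum weight E^-index times the bin width, which scales as E for log bins.
    weights = [e ** (1.0 - spectral_index) for e in energies]
    scales = [horizon(e) for e in energies]
    total = sum(weights)

    def cdf(d):
        # Spectrum-weighted mixture of the per-energy cumulative distributions.
        return sum(w * (1.0 - math.exp(-d / s))
                   for w, s in zip(weights, scales)) / total

    # Bisection for the distance at which the mixture CDF reaches 0.5.
    lo, hi = 0.0, 50.0 * max(scales)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_horizon(energy_eev):
    """Arbitrary placeholder, NOT a GZK calculation: shrinks as energy rises."""
    return 5000.0 / energy_eev


depth_low_cut = median_depth(40.0, 300.0, 2.5, toy_horizon)
depth_high_cut = median_depth(90.0, 300.0, 2.5, toy_horizon)
```

Even in this toy, raising the lower energy cutoff gives a shallower sample, which is the qualitative behavior that lets an energy cut stand in for a depth limit.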

Figure 2: Estimated signal-to-noise (S/N) for measurements 
of the UHECR-galaxy cross-correlation 
function on large angular scales (6-8°) as a function 
of median sample depth and size of UHECR 
sample. This calculation was done using 2MASS 
galaxy samples of various depths, assuming 
that UHECRs originate from these same 
galaxies. Different colored curves represent different 
UHECR sample sizes, as listed in the panel. 
This plot answers the question: At what significance 
can we detect the cross-correlation between 
UHECRs and 2MASS galaxies at large angular 
scales, if we have a UHECR sample of size $N_{\rm CR}$ 
and a given galaxy and UHECR sample depth? 

Figure 3: Median depth of a UHECR sample as a 
function of its lower energy cutoff, assuming that 
the probability distribution of distances for a single 
UHECR is given by the GZK effect. This calculation 
was done assuming a power-law UHECR energy 
spectrum. For each energy threshold, a total 
distance distribution was computed by weighting 
the probability distributions of individual energies 
by the overall energy spectrum. The sample depth 
decreases with energy because of the GZK effect. 